ABSTRACT

This thesis provides a logical design view of the
session services control layer of a distributed network to
be used in the SPLICE (Stock Point Logistics Integrated
Communication Environment) project. It examines the
functional requirements of session services, the data
necessary to provide that functionality, and the interfaces
required. These areas focus on the SPLICE
application specifically, but apply to generic session
services as well.

The  recommendations  that  are  offered  relate 
specifically  to  the  SPLICE  application  and  address  the 
prospect  of  placing  a  fault  tolerant  capability  in  session 
services  for  SPLICE.  Other  recommendations  are  appropriate 
only  to  the  SPLICE  environment. 
I. INTRODUCTION

Navy stock points and inventory control points (ICPs)
are outgrowing their current data processing capabilities. The
current hardware suite consists of medium sized Burroughs
B-3500/3700/4700/4800 Systems and is becoming saturated by a
seemingly endless demand for more support. Each year, at an
annual users' conference, the list of "things to do" grows
longer. Each stock point is autonomous with respect to how
it implements data processing support, as long as it
accommodates the Navy Supply Systems Command (NAVSUP)
requirements. As a result, various mini and micro computers
at these facilities have used locally-provided code instead
of the standard code provided by the Fleet Material Support
Office (FMSO) for the FMSO-maintained systems. FMSO has a
charter to provide system software and application software
for the existing data processing systems used at stock
points. This system is called the Uniform Automated Data
Processing System - Stock Points (UADPS-SP).

Long range plans to overcome the current situation
include the purchase of new hardware for the stock points
and ICPs. This effort has resulted in a contract with Tandem
Computer, Inc. to provide the necessary hardware. The design
framework in which this hardware is to be utilized is a
distributed Local Area Network (LAN) architecture. It has
been given the name Stock Point Logistics Integrated
Communications Environment (SPLICE).
SPLICE is designed to augment the existing data
processing  environment  at  stock  points.  UADPS-SP,  which  runs 
on  the  Burroughs  Systems  mentioned  above,  is  only  one  of  the 
automated  systems  at  most  stock  points.  There  is  the 
Integrated  Disbursement  and  Accounting  (IDA)  System  which  is 
operating  on  the  Interdata  1/32,  the  Automated  Procurement 
and  Data  Entry  (APADE)  System,  and  the  Trident  Logistics 
Data  System  (Trident  LDS).  Each  has  its  own  data  elements, 
files,  programs,  transactions,  users,  reports,  and  some  have 
additional  hardware.  To  augment  them  all  and  not  force 
redesign  to  existing  systems  is  a  difficult  task.  By  the 
time  that  SPLICE  is  ready  to  be  implemented,  there  will 
surely  be  more  systems  with  which  to  interface. 

Many of the new application requirements involve
interactive capabilities and telecommunications support.
These, in general, are not supportable with the current mix of
hardware and software.

The  two  advertised  major  objectives  of  SPLICE  are:  (1) 
to  allow  for  CRT  display  terminals  to  interact  with 
application  logic  and  to  fetch  information  from  a  system 
data  base;  and,  (2)  to  provide  a  standard  interface  for  each 
of the sixty-two supported supply sites. Notice that these
goals are stated in data processing rather than functional
terms.

The implementation, in general, is envisioned as having
mini-computers used as front-ends to the existing hardware.
The interfaces with existing systems will be controlled by
the LAN. The Burroughs systems would be able to provide
larger file processing functions and report generation
functions in a background mode. Each LAN is planned to be
"standardized," and to have the capability to communicate with
other LANs via the Defense Data Network (DDN), which is
managed by the Defense Communications Agency.

The functional objects which are contained in the
SPLICE network have been identified at a high level of
abstraction (Schneidewind and Dolk, Nov 1983), and major
functions have been identified for those objects. It is the
purpose of this paper to decompose the object called Session
Services into lower levels of abstraction, and to clarify
the functions at a level which should feed the design of the
actual module.

A session can be described as all of the activity which
takes place among two or more processes for the duration of
a single task. The Session Services module is the object
which functions as the liaison between the user and required
functional modules. When the user specifies a required task,
the session services object will initiate and control the
required functional modules to accomplish the requested
task, and return the response and the control to the user.
This control mechanism is a complex one, primarily due to
the following constraints:

o  A user process may have multiple sessions active at
any time.

o  A functional module can be active in multiple
sessions at a time.

o  Two or more functional modules can be active in
multiple sessions at any time.

o  Message exchange between pairs of functional
modules can be nested.

The multi-tasking requirements above are in addition to
message exchanges between local and remote sites, and tasks
may be either deferrable or interactive.

This complex processing environment requires that a
module exist to control this activity. This role has been
delegated to the Session Services Module.

To more fully comprehend the role of Session Services,
it may be helpful to examine the layers of control found in
a network architecture, of which Session Services is a
member.


II.  NETWORK  ARCHITECTURE  LEVELS 


A fundamental goal of the SPLICE network architecture
is to permit system designs that support geographic
distribution of workstations which reflect the natural
partitioning (Bachman, 1978) of the Navy activities
involved. All application programs must be independent of
the physical topology of the nodes where they reside. The
Session Services Layer supports this independence.
Workstation programs are written to request session
establishment among themselves using only logical addressing
names (mailboxes), which are independent of physical
topology. These requests are all sent to Session Services.
The Session Services module employs two principal mechanisms
(Bachman, 1978). The first mechanism links
processes into temporary cooperative relationships by
locating the desired partner process via the data
dictionary/directory system and activating that process in a
workstation after ensuring that the partner is ready. An
approach for increasing the speed and reducing the overhead
by cutting handshaking procedures to a minimum is explained
in (Schneidewind and Dolk, 1983). The second mechanism
is the exchange of data, status, and control
information over established sessions. This synchronizes
cooperating processes, the databases they modify, and the
journal entries of exchanged messages in support of data
integrity requirements. Security, resource management, and
system administration also have manifestations at the
session service level, but have not been included as major
Session Services mechanisms.
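
The first mechanism, locating a partner by logical name through the directory, can be illustrated with a minimal sketch. The class names, the node string format, and the mailbox names below are assumptions made for illustration only; they are not part of the SPLICE design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectoryEntry:
    """Hypothetical directory record: where a logical mailbox currently lives."""
    node: str    # physical node identifier (assumed format)
    ready: bool  # whether the partner process can accept a session


class Directory:
    """Stand-in for the data dictionary/directory system."""

    def __init__(self):
        self._entries = {}

    def register(self, mailbox, node, ready=True):
        self._entries[mailbox] = DirectoryEntry(node, ready)

    def locate(self, mailbox):
        # Returns None when the logical name is unknown.
        return self._entries.get(mailbox)


def establish_link(directory, requester, partner):
    """Link two processes by logical name only; return the session endpoints."""
    entry = directory.locate(partner)
    if entry is None:
        raise LookupError(f"unknown mailbox: {partner}")
    if not entry.ready:
        raise RuntimeError(f"partner not ready: {partner}")
    return {"from": requester, "to": partner, "node": entry.node}


directory = Directory()
directory.register("REQUISITION-STATUS", node="lan-a/host-3")
link = establish_link(directory, "TERMINAL-17", "REQUISITION-STATUS")

# Moving the partner to another node changes only the directory entry;
# the requesting program still uses the same logical name.
directory.register("REQUISITION-STATUS", node="lan-b/host-1")
relinked = establish_link(directory, "TERMINAL-17", "REQUISITION-STATUS")
```

The requester never names a node, which is the topology independence described above.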

Network architectures may vary with regard to
implementation structures but, in general, the functions
provided are the same. The session services layer is usually
one of several layers of control identified within an
architecture. Using a logical architecture model closely
related to most commercially available architectures is a
useful way of examining the entire control structure.

A layered approach to network architecture from the
International Standards Organization (ISO) defines seven
distinct layers of protocol (Deitel, 1983):

Physical Layer - This layer handles the mechanical and
electrical details of the actual physical
transmission of bit sequences over communication
lines.

Data Link Layer - This layer controls the manipulation
of data packets. It handles the addressing of
outgoing packets and the decoding of addresses on
incoming packets. It detects and possibly corrects
errors that occur in the Physical Layer.

Network Layer - This layer controls the switching
and routing of messages between stations in the
network.

Transport Layer - This layer provides for transfer of
messages between end users. The users need not be
concerned with the manner in which reliable and cost
effective data transfers are achieved.

Session Layer - This layer provides the means for
cooperating processes to organize and synchronize
their dialog and manage their data exchange.

Presentation Layer - This layer resolves differences in
formats among the various computers, terminals,
databases, and languages used in a network.

Application Layer - This layer is the highest layer and
provides services directly to users. It deals with
data exactly as it is generated by and delivered to
user processes. This level contains the user-specified
functions and program controls.


The topology, media, and access procedures for the
eventual SPLICE system will not be addressed, as they are
implementation features. Instead, the emphasis will be placed
on design considerations which impact the system
requirements.

The next section will use hierarchical decomposition to
aid in the definition of the Session Services control
functions.
III. SESSION SERVICES FUNCTIONS

Each of the major functions will be decomposed into
its logical pieces and described in more detail.

The first step is to describe the functions that are
within the scope of the design. No attempt is made to
constrain this step by conforming to current organizational
structure or data processing systems in use. No attempt is
made to determine the likelihood of automation. This
functional decomposition is not based on organizational
attributes, nor does the decomposition address control
issues at any level. Naming the business functions as
accurately and consistently as possible helps users relate
to the formalized function descriptions during the
requirements review. At each aggregate level of abstraction,
an attempt was made to maintain a similar level of detail
throughout that level. That is, the scope, size, magnitude,
and relative importance of each module should be
approximately equal at each level. During the process of
function definition, the next lower level of functionality
is described in an effort to validate the level being
developed, and to ensure the ability to make accurate
tradeoff decisions concerning cohesion and coupling issues
later. The function description consists of a unique
identifying number, the logical function name, a description
of the function, and its required input and output. When
validating functions with respect to system requirements, it
is useful to have a hierarchical chart of the functions
available to cross-reference the function with its relative
position among the functions as a whole. The data flow
diagram is also a helpful tool to relate the flow of data
through the different functions. Particularly confusing
ideas may also be charted in an effort to clarify the
particular environment or requirement.

The following diagram shows the major logical
components of the session services module:

1.0  SESSION SERVICES
     1.1  ESTABLISH SESSION
          1.1.1  DETERMINE MAP STRUCTURE
          1.1.2  REJECT SESSION REQUEST
          1.1.3  INITIATE SESSION PROCESS
     1.2  MONITOR SESSION
          1.2.1  TRAVERSE MAP STRUCTURE
          1.2.2  ALTER MAP STRUCTURE
          1.2.3  VERIFY CFM OUTPUT
     1.3  CLOSE SESSION
          1.3.1  TERMINATE SESSION PROCESS
          1.3.2  ASSIST INTERRUPT PROCESS
          1.3.3  ASSIST RECOVERY PROCESS

Table 1
The following descriptions are for high level
functions. Because each high level function is subsequently
broken into lower level functions, the descriptions will
only include a narrative describing the function performed.
The lower level functions will then be described in more
detail and inputs and outputs will be identified.


1.0  SESSION SERVICES
     1.1  ESTABLISH SESSION
     1.2  MONITOR SESSION
     1.3  CLOSE SESSION


******************************* 

NAME:  SESSION  SERVICES 
IDENTIFIER:  1.0 

Establishes, monitors, and closes all sessions on
the system. It initiates and controls the required
controlling functional modules to accomplish the
requested task(s), and returns a response and/or
control to the user via the TM module upon completion.
This module is capable of controlling multiple sessions
per user.

****************************** 

NAME:  ESTABLISH  SESSION 
IDENTIFIER:  1.1 

Translates a service request from the Terminal
Management Module (TM) to its required mapping of
functional modules to satisfy the task named in the
service request. The function also activates the
initial Controlling Functional Module (CFM) and sets
session status to indicate an active session, or
rejects the session request based on security or user
authority criteria.
*******************************

NAME: MONITOR SESSION
IDENTIFIER: 1.2

Maintains communication with, and passes control
to, each CFM in accordance with the session map. This
function identifies additional data required for the
user to complete a given task based on the map of
functional modules. This function also makes any
changes to the functional module map, and will verify
output from a CFM for those tasks which contain fault
tolerant backup modules.

******************************

NAME:  CLOSE  SESSION 
IDENTIFIER:  1.3 

Updates  session  status  to  show  result  of  Session 
Complete  Message,  aids  the  system  in  maintaining 
consistency  in  the  session  and/or  system  status  during 
interrupt  processes,  and  aids  the  Recovery  Module  by 
providing  accurate  session  status  information. 

The following data flow diagram is an example of the
system activity that occurs at the highest level. The data
flow diagram is most useful to the reader as a verification
that the system is consistent and logical. It is sometimes
easier to view the system in this chart form.
The next section describes the lowest level functions.
At this level, the modules will have suggested logical
inputs and outputs identified for later mapping to a low
level data flow diagram. Inputs and outputs have been
associated with their respective sources (origins) and
destinations (targets). The sources and destinations are
SPLICE modules and CFMs. When either the source or the
destination is a sub-module within the Session Services
Module, it will be identified by the unique identifier of the
sub-module. To later complete the logical design for the
specific SPLICE application, the input and output data must
be quantified. The expected level of activity will help to
size the system and provide necessary input to design
decisions concerning data base structure as well as module
determination.

The decomposition that follows is for the Establish
Session function.

1.1  ESTABLISH SESSION
     1.1.1  DETERMINE MAP STRUCTURE
     1.1.2  REJECT SESSION REQUEST
     1.1.3  INITIATE SESSION PROCESS

******************************

NAME: DETERMINE MAP STRUCTURE

IDENTIFICATION: 1.1.1

DESCRIPTION: This function determines the correct map
structure based on the task request and user or system
provided parameters. The map structure is a definition
of the processes within a task and their relationships
to one another. Task requests may be directed to a
specific workstation mailbox by means of a unique
logical name. This information is used to enter the
data dictionary/directory system and retrieve the
associated mapping structure for the required
functional modules.

                                    SOURCE/DESTINATION

INPUT:  Session Request             Terminal Management Module

OUTPUT: Task Request                Data Dictionary Functional
                                    Module

****************************** 


NAME:  REJECT  SESSION  REQUEST 

IDENTIFICATION:  1.1.2 

DESCRIPTION: This function receives the exception status
from the data dictionary/directory system (DD/DS) and
sends an appropriate response to the originator via the TM.

                                    SOURCE/DESTINATION

INPUT:  Task Rejection              Data Dictionary Functional
                                    Module

OUTPUT: Session Rejection           Terminal Management Module

******************************* 


NAME:  INITIATE  SESSION  PROCESS 

IDENTIFICATION:  1.1.3 

DESCRIPTION: This function assigns a unique identifier to
the session, initiates a session-data entry which
includes the map structure, a timestamp of initiation,
current controlling functional module, etc., flags the
controlling functional module as "active," and passes
control via a message to the local or remote CFM. (The
National Communication Module will relay this message
over the DDN if it is destined for a remote CFM.)

                                    SOURCE/DESTINATION

INPUT:  Map Structure               Data Dictionary Functional
                                    Module

OUTPUT: Controlling Functional      Local or Remote CFM
        Module Name
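
The three Establish Session sub-functions can be read as one small pipeline. The sketch below is illustrative only: the directory contents, the numeric authority levels, and every field name are assumptions, and the directory lookup stands in for the data dictionary/directory system.

```python
import itertools
import time

# Hypothetical directory contents: service request code -> ordered CFM names
# and the authority level a user needs. All values are illustrative.
TASK_DIRECTORY = {
    "STOCK-QUERY": {"map": ["QUERY-CFM", "REPORT-CFM"], "authority": 1},
    "FUNDS-UPDATE": {"map": ["FUNDS-CFM"], "authority": 3},
}

_session_numbers = itertools.count(1)
SESSIONS = {}  # session number -> session-data entry


def determine_map_structure(request):
    """1.1.1: look up the map; return (map, None) or (None, rejection reason)."""
    task = TASK_DIRECTORY.get(request["service_code"])
    if task is None:
        return None, "unknown task"
    if request["authority"] < task["authority"]:
        return None, "insufficient authority"
    return list(task["map"]), None


def reject_session_request(request, reason):
    """1.1.2: build the session rejection returned through the TM."""
    return {"to": request["terminal"], "status": "REJECTED", "reason": reason}


def initiate_session_process(request, map_structure):
    """1.1.3: create the session-data entry and name the first CFM."""
    number = next(_session_numbers)
    SESSIONS[number] = {
        "map": map_structure,
        "started": time.time(),
        "current_cfm": map_structure[0],
        "status": "ACTIVE",
        "user": request["user"],
    }
    return {"session": number, "activate": map_structure[0]}


def establish_session(request):
    """1.1: combine the three sub-functions."""
    map_structure, reason = determine_map_structure(request)
    if map_structure is None:
        return reject_session_request(request, reason)
    return initiate_session_process(request, map_structure)
```

A request carrying sufficient authority yields a session number and the name of the first CFM to activate; any other request yields a rejection addressed to the originating terminal.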


The   decomposition   that   follows  is   for   the   Monitor 
Session  function. 


1.2  MONITOR SESSION
     1.2.1  TRAVERSE MAP STRUCTURE
     1.2.2  ALTER MAP STRUCTURE
     1.2.3  VERIFY CFM OUTPUT

******************************


NAME:  TRAVERSE  MAP  STRUCTURE 


IDENTIFICATION: 1.2.1

DESCRIPTION: This function interprets completion messages,
output text location(s), and/or parameters from active
Controlling Functional Modules (CFMs). It checks the
message against the active map and identifies the next
CFM to be activated. It removes the activation pointer
from the current CFM and places it on the next CFM to
be initiated by sending a message to the CFM. If the
map has been completely traversed, then a completion
message and the location of output text and/or
parameters are sent to the Terminal Management Module
(TM). (A graphical view of the possible process state
transitions is given in Table 2.) This module
also receives abnormal termination messages from CFMs
which have fault tolerant backup modules and replaces
the module producing the error with a replacement
module.

                                    SOURCE/DESTINATION

INPUT:  CFM Completion Message      Controlling Functional
                                    Module (CFM)
        Output Text Location        CFM
        Control Parameter           CFM

OUTPUT: Session Completion Message  Terminate Session
                                    Process (1.3.1)
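
As an illustration of the activation pointer described for function 1.2.1, the following sketch models the map as an ordered list of CFM names. That linear shape, the message dictionaries, and the handling of a failed module with no backup are assumptions; the source does not specify them.

```python
def traverse_map_structure(session, completion):
    """Advance the activation pointer after a CFM completion message.

    `session` is a dict with "map" (ordered CFM names), "current" (index of
    the active CFM) and "backups" (failed module -> replacement name).
    Returns the message to send next. All field names are hypothetical.
    """
    active = session["map"][session["current"]]
    if completion["cfm"] != active:
        raise ValueError(f"message from {completion['cfm']}, expected {active}")

    if completion.get("abnormal"):
        replacement = session["backups"].get(active)
        if replacement is None:
            # Assumed behavior: no backup exists, so the session ends abnormally.
            return {"to": "1.3.1", "type": "SESSION-COMPLETE", "reason": "abend"}
        # Swap the failed module for its backup and re-activate in place.
        session["map"][session["current"]] = replacement
        return {"to": replacement, "type": "ACTIVATE"}

    session["current"] += 1
    if session["current"] == len(session["map"]):
        # Map fully traversed: hand off to Terminate Session Process (1.3.1).
        return {
            "to": "1.3.1",
            "type": "SESSION-COMPLETE",
            "reason": "normal",
            "output_location": completion.get("output_location"),
        }
    return {"to": session["map"][session["current"]], "type": "ACTIVATE"}
```

Only one index moves through the map, so at most one CFM per session holds the activation pointer at any time.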
PROCESS STATE TRANSITIONS

[Diagram: process state transitions. Legible states: SUSPENDED
READY, SUSPENDED BLOCKED. Legible transition labels: I/O
COMPLETION OR EVENT COMPLETION; I/O WAIT OR EVENT WAIT; SUSPEND.]

TABLE 2
******************************

NAME: ALTER MAP STRUCTURE

IDENTIFICATION: 1.2.2

DESCRIPTION: This function is responsible for any alteration
made to the functional map associated with a task. Data
can be received from the active controlling functional
module, the fault tolerant output verification module
(1.2.3), or the user. The data may consist of calls to
modules that are outside of the normal map structure,
requests for additional information, or data describing
interference situations caused by access of shared
resources.

                                    SOURCE/DESTINATION

INPUT:  FM Call Validate Request    CFM
        User Message                Terminal Management

OUTPUT: Map Alteration Message      Traverse Map
                                    Structure (1.2.1)
        FM Call Validation Response CFM


****************************** 

NAME:   VERIFY  CFM  OUTPUT 

IDENTIFIER: 1.2.3

DESCRIPTION: This function verifies output from Controlling
Functional Modules prior to control being transferred
from the CFM. It compares the output data that has
resulted from the CFM against a prescribed valid output
standard that has been developed for the module. The
output will either match, meaning that control can be
passed because the module arrived at valid data, or the
output will not be considered valid, meaning the module
has failed. When the module has failed to provide valid
data, the name of the logical fault tolerant module
replacement is sent to Alter Map Structure and the
replacement module will be executed.

                                    SOURCE/DESTINATION

INPUT:  CFM Output                  CFM

OUTPUT: Replacement Module Name     Alter Map Structure
                                    (1.2.2)
        Message to User             Terminal Management
        Error Notification          CFM
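
The comparison against a prescribed output standard might look like the sketch below. Representing the standard as a predicate per CFM, and the message shapes, are assumptions chosen for brevity; the source states only that output is compared against a valid output standard.

```python
def verify_cfm_output(cfm_name, output, standards, backups):
    """Check CFM output against its prescribed standard (1.2.3).

    `standards` maps a CFM name to a predicate returning True for valid
    output; `backups` maps a CFM name to its fault tolerant replacement.
    Both tables are hypothetical stand-ins.
    """
    is_valid = standards[cfm_name]
    if is_valid(output):
        # Valid data: control may be passed on, nothing to report.
        return {"valid": True, "messages": []}

    messages = [
        {"to": "TM", "type": "MESSAGE-TO-USER", "text": f"{cfm_name} failed"},
        {"to": cfm_name, "type": "ERROR-NOTIFICATION"},
    ]
    replacement = backups.get(cfm_name)
    if replacement is not None:
        # Name the replacement for Alter Map Structure (1.2.2).
        messages.insert(
            0,
            {"to": "1.2.2", "type": "REPLACEMENT-MODULE-NAME", "name": replacement},
        )
    return {"valid": False, "messages": messages}
```

For example, a quantity module could be given the standard "output is a non-negative integer"; a negative result would then produce the three outputs listed in the table above.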


The decomposition that follows is for the Close Session
function.

1.3  CLOSE SESSION
     1.3.1  TERMINATE SESSION PROCESS
     1.3.2  ASSIST INTERRUPT PROCESS
     1.3.3  ASSIST RECOVERY PROCESS

************************** 


NAME:  TERMINATE  SESSION  PROCESS 


IDENTIFIER:  1.3.1 

DESCRIPTION: This function is responsible for completing the
session-status data entry for a completed map
structure, and signifying the end of a session. The
"active" flag will be removed from the last active
CFM, the entry will be timestamped for termination, and
an appropriate message will be sent to the TM module
indicating the reason for termination and the location
of any existing output.

                                    SOURCE/DESTINATION

INPUT:  Map Complete Message        Traverse Map Structure
                                    (1.2.1)

OUTPUT: Output Location             TM module
        Termination Message         TM module
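
A minimal sketch of the bookkeeping performed by function 1.3.1 follows. The dictionary fields and the optional `now` parameter are hypothetical and exist only to make the steps concrete.

```python
import time


def terminate_session_process(entry, reason, output_location=None, now=None):
    """Close out a session-status entry and build the TM notification.

    `entry` is a dict holding the session-status data; its field names
    are hypothetical. `now` lets a caller supply the timestamp.
    """
    entry["active_cfm"] = None  # remove the "active" flag
    entry["terminated_at"] = time.time() if now is None else now
    entry["status"] = "COMPLETE"
    return {
        "to": "TM",
        "type": "TERMINATION",
        "reason": reason,
        "output_location": output_location,
    }
```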


************************** 

NAME:  ASSIST  INTERRUPT  PROCESS 

IDENTIFIER:  1.3.2 

DESCRIPTION: This function will resolve interrupts
originating when the active controlling functional
module requires additional data concerning user desires
or system requirements.

                                    SOURCE/DESTINATION

INPUT:  CFM Request                 CFM
        User Request                TM module

OUTPUT: User Request                TM module
        CFM Request                 CFM

***************************** 

NAME: ASSIST RECOVERY PROCESS

IDENTIFICATION: 1.3.3

DESCRIPTION: This function aids the Recovery Module (RM) by
providing current status of all sessions upon request.
The restart and/or recovery is controlled entirely by
the RM module.

                                    SOURCE/DESTINATION

INPUT:  Status Request              RM module

OUTPUT: Status Response             RM module


The following chart shows a summary of possible
commands for the Session Services module:

INITIATE 

ACCEPT 

TERMINATE 

SEND 
RECEIVE 

CANCEL  SESSION-QUARANTINE-UNIT 

TERMINATE 

SESSION-INTERACTION-UNIT


PROCESS  RECOVERY  UNIT 


COMMIT         PROCESS-COMMITMENT-UNIT

RECOVER 


Table  3 

The following data flow chart is an example of the
system activity that occurs at the lower level. It is useful
both as a different, pictorial view of the interaction, and
as a tool when mapping the data to the functions.

[Figure: LOW LEVEL DATA FLOW DIAGRAM]
IV. SESSION SERVICES DATA

The next step is to define the data used within the
scope of the system. At first, this is done by isolating the
view to data only, ignoring any relationship to functions.
Because many misconceptions about definitions of terms may
exist and may detract from the document's readability,
several significant terms associated with the
interpretations of data used in this document are defined
below:

DATA CLASS - a collection of data used to describe an
entity which is easily describable, readily comprehensible,
and meaningful (within the system boundaries), past the
point of just being a collection of data elements. Usually
data classes can be uniquely identified, and have unique data
elements associated with them. Basically, two types are
distinguishable: objects and concepts.

OBJECT - a particular occurrence of a data class which
exists, and is capable of being sensed.

CONCEPT - an effort or action in the real world. An
example of a typical concept might make the distinction
clearer. A "bank account" may well be a concept data class:
it is uniquely defined by an account number and has meaning
within the banking industry, yet you cannot touch it. It
exists merely as an agreement between the customer and the
bank. It can be represented or referred to with listings of
account status, or inquiry, but cannot be seen or sensed.

INDICATIVE KEY - the attribute which enables an object
or concept within a data class to be uniquely identified. An
indicative key points to one and only one occurrence of a
data class.

XREF KEY - an attribute used to show a relationship of
one data class to another data class. The physical
implementation may appear as a link record or pointer.

Each data class is described in a form consisting of a
data description, key elements, and relationships to other
data classes. Selection of the data class name should be
consistent with the usage within the system being modeled.
That is, the users of the system should not be forced to
change their normal vocabulary. The identification of the
keys is helpful later in the development of the optimum
data structure.

The definition of raw data for the system is completed
when this step is completed. Further refinements and
subsequent information may require any of the steps to be
re-evaluated and altered.

DATA CLASS DESCRIPTIONS
*************************

NAME: SESSION - Describes the time frame that represents a
user requesting a task from the SPLICE system and
receiving a response to the request. The
beginning is marked by the Session Services
Module after it has received clearance for a
particular user for a particular functional
module. The end is marked by the Session Services
Module when an error has forced an abend status
or the functional map has been completed, and
control is returned to the user via the Terminal
Management Module.

ELEMENTS:

Session Number (Indicative key)
Timestamp
Functional Map
Controlling Functional Modules
Current Controlling Functional Modules
User Identifier (XREF to User)
Terminal Identifier (XREF to Mailbox)
Service Request Code (XREF to Task)
Status Code (Inactive, suspend, resume, I/O
wait, etc.)
******************************

NAME: TASK - Describes each job that the SPLICE system
allows the user to request from any work station
in the system. It may relate to a single
Functional Module, called a Controlling
Functional Module, or it may be a mapping of
several modules containing a Controlling
Functional Module.

ELEMENTS:

Service Request Code (Indicative key)
User Authority Required
Logical Name of Module
Logical Location
Physical Location
Parameter Specification

******************************* 

NAME: MAILBOX - A method of uniquely identifying
repositories of data from deferred tasks. It also
identifies the nature of the service requested
for those users with multiple mailboxes.

ELEMENTS:

Mailbox Number (Indicative key)
Command Location
Logical Address
Physical Address
Hardware Configuration
Clearance Level
Constraints

************************************** 

NAME: USER - A person or group of persons who are
authorized access to the SPLICE system.

ELEMENTS:

Password (Indicative key)
Log-on
User Name
Parent Command
Authorization Level
Terminal Restrictions (XREF to Mailbox)
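
The four data classes and their key relationships can be sketched as plain records. This is an illustration only: the field types are assumed, several listed elements are omitted for brevity, and the dictionaries stand in for whatever data structure is eventually chosen.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class User:
    password: str                     # indicative key, as listed above
    log_on: str
    user_name: str
    parent_command: str
    authorization_level: int
    terminal_restrictions: List[int]  # XREF to Mailbox numbers


@dataclass
class Mailbox:
    mailbox_number: int               # indicative key
    logical_address: str
    physical_address: str
    clearance_level: int


@dataclass
class Task:
    service_request_code: str         # indicative key
    user_authority_required: int
    logical_module_name: str


@dataclass
class Session:
    session_number: int               # indicative key
    timestamp: float
    functional_map: List[str]
    current_cfm: str
    user_id: str                      # XREF to User
    terminal_id: int                  # XREF to Mailbox
    service_request_code: str         # XREF to Task
    status_code: str = "ACTIVE"


def resolve_xrefs(session, users, mailboxes, tasks):
    """Follow each XREF key from a session to the record it names."""
    return (
        users[session.user_id],
        mailboxes[session.terminal_id],
        tasks[session.service_request_code],
    )
```

Each indicative key selects exactly one record, and each XREF key is simply the indicative key of a record in another data class.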
V. MAPPING OF DATA TO FUNCTIONS

The logical design, based on function and data, was
presented in the previous sections. Future work in this area
will involve mapping the data to the functions; that is,
determining which functions use which data.

The data are categorized with respect to the role they
play for that function as well. Data can be a trigger that
causes the function to occur; it can be input from another
function or outside source; it can be input from the data
repository; it can be an output which triggers another
function; it can be output to a data repository; or it can
be output to another function or outside destination.
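
The six roles listed above can be recorded explicitly when the data are mapped to functions. The following is a minimal sketch of one way to do so; the names `DataRole`, `Mapping`, and `data_used_by`, as well as the sample function names, are illustrative assumptions and are not part of the SPLICE design.

```python
from dataclasses import dataclass
from enum import Enum


class DataRole(Enum):
    """The six roles a data item can play for a function."""
    TRIGGER = "trigger that causes the function to occur"
    INPUT_FROM_FUNCTION = "input from another function or outside source"
    INPUT_FROM_REPOSITORY = "input from the data repository"
    OUTPUT_TRIGGER = "output which triggers another function"
    OUTPUT_TO_REPOSITORY = "output to a data repository"
    OUTPUT_TO_FUNCTION = "output to another function or outside destination"


@dataclass(frozen=True)
class Mapping:
    """One entry of the data-to-function map."""
    function: str
    data_item: str
    role: DataRole


def data_used_by(mappings, function):
    """Return the sorted names of all data items a function uses."""
    return sorted({m.data_item for m in mappings if m.function == function})


# Hypothetical entries, for illustration only.
mappings = [
    Mapping("Initiate Session", "Session Request", DataRole.TRIGGER),
    Mapping("Initiate Session", "User", DataRole.INPUT_FROM_REPOSITORY),
    Mapping("Initiate Session", "Session", DataRole.OUTPUT_TO_REPOSITORY),
    Mapping("Terminate Session", "Session", DataRole.INPUT_FROM_REPOSITORY),
]

print(data_used_by(mappings, "Initiate Session"))
```

Keeping the role alongside each entry means the same table can later answer both "which functions use which data" and "in what capacity."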

After data have been mapped to the individual
functions, a determination is made of what transformations
occur in the system. A transformation is an action
that changes the input data and creates or helps to create
an output. Using input/output data defined in the function
description, values for the quantities of each particular
input and output can be used to determine the numbers of
occurrences of these transactions. The quantities may
address data class abstractions; therefore, the data
dictionary may have to be used to cross reference data
elements that are at the particular level identified in the
transformations.

This information can be used to graph the number of
occurrences of transforms and relate this information to the
required data relationships. Using this graph, the
transformations that most frequently occur within the system
are identified. This graph can then be used to decide which
logical data relationships to implement.
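
As a rough illustration of this ranking step, the sketch below tallies how often each pair of data entities is used together and lists the pairs from most to least used. The entity names come from the data dictionary entries above; the transformation names and the occurrence counts are invented purely for the example.

```python
from collections import Counter

# Hypothetical transformations:
# name -> (occurrences per day, pair of data entities related by it)
transformations = {
    "validate user": (600, ("USER", "MAILBOX")),
    "open session": (500, ("SESSION", "USER")),
    "map task": (450, ("SESSION", "TASK")),
    "route output": (120, ("SESSION", "MAILBOX")),
}


def relationship_usage(transformations):
    """Total the occurrences that exercise each data relationship."""
    usage = Counter()
    for occurrences, entities in transformations.values():
        usage[frozenset(entities)] += occurrences
    return usage


def ranked_relationships(transformations):
    """List the relationships from most to least frequently used."""
    usage = relationship_usage(transformations)
    return [(tuple(sorted(pair)), count) for pair, count in usage.most_common()]


for pair, count in ranked_relationships(transformations):
    print(pair, count)
```

The relationships at the top of the list are the candidates for direct implementation; those near the bottom may be served adequately by cross reference through the data dictionary.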
VI.  SESSION  SERVICES  INTERFACES

The Session Services Module has no interface external
to the SPLICE system. The user and other systems access
the Session Services Module only via other SPLICE modules.
There are major interfaces which must be defined within the
SPLICE system. The following interfaces are addressed on a
logical basis without identifying the actual implementation
details:

SESSION  SERVICES  TO  TERMINAL  MANAGEMENT  MODULE 

Rather than having Session Services communicate directly
with user terminals, the functionality is separated between
Session Services and Terminal Management. The end user
always communicates with the Terminal Management Module
(TM), and the TM always communicates with Session Services
or the user. The TM sends session requests from a particular
mailbox or interactive user to Session Services. The TM
receives the completion message, the output location, the
rejection message, and the information request message from
Session Services. The TM will provide all necessary message
editing, screen management, and virtual terminal operating
functions.

SESSION  SERVICES  TO  RECOVERY  MODULE 

Session  Services  is  a  support  module  to  the  Recovery 
Module. When the Recovery Module requires data concerning
the current or recoverable state of all or any session, the
request will be addressed to Session Services. Session
Services will access the required data and forward them to
the Recovery Module. No recovery logic is contained in the
Session  Services  Module. 

SESSION  SERVICES  TO  DATA  DICTIONARY/DIRECTORY 

When  Session  Services  requires  data  concerning  user 
authority,  task  security  level,  task  mapping  structures, 
data  structures,  or  functional  module  relationships,  it  will 
request   that   data   from  the  Data  Dictionary/Directory. 

SESSION  SERVICES  TO  NATIONAL  COMMUNICATION  MODULE 

During a session, control is passed to the Controlling
Functional Modules via the logical bus mechanism. If the
logical name refers to a remote module, then it is picked up
by the National Communication Module (NC) and initiated for
transmission on the DDN. The NC Module will convert LAN
protocol to Defense Data Network protocol and vice versa,
enabling Session Services to rely only on the local LAN
protocol.

SESSION  SERVICES  TO  LOCAL  COMMUNICATION  MODULE 

During  a  session,  control  is  passed  to  the  Controlling 
Functional  Module  via  the  logical  bus  mechanism.  If  the 
logical  name  relates  to  a  module  contained  in  the  LAN,  then 
it  is  picked  up  by  the  Local  Communication  Module  and 
initiated.

SESSION  SERVICES  TO  SESSION  STATE  DATA 

Session  Services  is  the  only  module  able  to  access 
session state data. The Session Services Module will
maintain state information in its own file. These data
should be structured so that no other module may access
them or, more critically, alter them.

SESSION  SERVICES  TO  FUNCTIONAL  MODULES 

There  is  no  interface  between  Session  Services  and 
functional  modules,  other  than  a  Controlling  Functional 
Module. All called functional modules interface solely with
the CFM.


In  general  the  interfaces  do  not  require  any  additional 
information  to  be  added  to  the  LAN  message  format  presented 
in (Schneidewind, 1982). The data that are contained in the
LAN  message  format  include: 


MESSAGE  TYPE 

DATE  AND  TIME 

DESTINATION  ADDRESS 

DESTINATION  LOGICAL  ADDRESS 
DESTINATION  PHYSICAL  ADDRESS 

SOURCE  ADDRESS 

SOURCE  LOGICAL  ADDRESS 
SOURCE  PHYSICAL  ADDRESS 

NUMBER  OF  FRAGMENTS 

MESSAGE  NUMBER 

FRAGMENT  NUMBER 

ACKNOWLEDGEMENT  FRAGMENT  NUMBER 

DATA  LENGTH 

SERVICE  REQUEST  CODE 

DATA 

ERROR  CHECK 

FLAGS 

All fields are fixed length except for the field called
DATA.
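
As an illustration only, the message format listed above could be modeled as a record type. In the sketch below, the field types, the grouping of the logical and physical addresses under a single `Address` record, and the choice to derive DATA LENGTH from the variable-length DATA field are all assumptions; the source specifies only the field names.

```python
from dataclasses import dataclass


@dataclass
class Address:
    """Assumed grouping of the logical and physical address fields."""
    logical: str
    physical: str


@dataclass
class LanMessage:
    """One LAN message; every field except `data` is fixed length."""
    message_type: int
    date_and_time: str
    destination: Address
    source: Address
    number_of_fragments: int
    message_number: int
    fragment_number: int
    acknowledgement_fragment_number: int
    service_request_code: str
    data: bytes          # the only variable-length field
    error_check: int
    flags: int

    @property
    def data_length(self) -> int:
        """DATA LENGTH, derived here so it cannot disagree with DATA."""
        return len(self.data)


# Hypothetical values, for illustration only.
message = LanMessage(
    message_type=1,
    date_and_time="1984-06-01T12:00:00",
    destination=Address(logical="SESSION_SERVICES", physical="NODE-02"),
    source=Address(logical="TERMINAL_MANAGEMENT", physical="NODE-01"),
    number_of_fragments=1,
    message_number=42,
    fragment_number=1,
    acknowledgement_fragment_number=0,
    service_request_code="SRC001",
    data=b"hello",
    error_check=0,
    flags=0,
)
print(message.data_length)
```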
VII.  DESIGN  RECOMMENDATIONS

Our major design recommendation concerns the
implementation of fault tolerant techniques, which is
covered in some detail because of the general lack of
awareness concerning fault tolerant methods. The other
recommendations address possible problem areas related to
the nature of the application environment.

A.    FAULT  TOLERANCE 

The  notion  of  incorporating  a  means  for  tolerating 
faults  in  order  to  improve  computer  system  reliability  is 
well  established.  Because  of  the  usual  multiple  meanings 
of  data  processing  terms,  we  will  attempt  to  define  several 
terms.  Fault  tolerance  is  a  term  describing  a  system  capable 
of  coping  with  faults  without  a  requirement  for  manual 
intervention.  Fault  prevention,  a  similar  concept,  avoids 
potential  faults  by  detecting  and  eliminating  them  prior  to 
operating  the  system.  Once  the  system  is  operational,  fault 
tolerance, if  it  exists,  begins.  The  SPLICE  session  services 
module  should  employ  fault  tolerance  to  improve  reliability. 

For  a  system  to  be  fault  tolerant,  it  must  be  able  to: 
detect  errors,  assess  and  confine  the  damage,  and  repair  or 
recover  from  the  error  without  forcing  manual  intervention. 
Detection methods are most abundant. Methods for assessing,
confining, and recovering are neither well defined nor
readily available.

Unmastered  complexity  at  any  level  increases  the 
possibility  of  a  fault  occurring.  It  is  highly  unlikely  that 
the SPLICE system will ever be fault free. A major part of
the complexity in SPLICE will be implemented in software.
Therefore, software developers should seek not to design
"perfect, error free" software, but rather attempt to
provide reliable fault tolerant techniques for SPLICE in all
modules.

Any definition of the reliability of a system must
involve distinguishing between acceptable and unacceptable
behavior of the system. There must be some method to
distinguish  between  unsatisfactory  behavior  which  is  a 
consequence  of  user  misunderstanding  and  unacceptable 
behavior  due  to  deficiencies  in  the  system  itself.  The 
system  specification  should  provide  that  mechanism. 

Ideally,  a  specification  should  be  consistent, 
authoritative  and  so  complete  that  the  behavior  of  the 
system  is  defined  for  all  possible  input  and  output 
sequences.  In  each  circumstance  where  this  is  not  so, 
acceptable  behavior  cannot  be  distinguished  from 
unacceptable  behavior.  Regardless  of  the  specification,  only 
two  concepts  are  essential  to  understand  the  causes  of 
failure:  (1)  an  event  (state)  which  should  not  have  occurred 
and  (2)  a  condition  (state)  which  should  not  have  arisen. 
Internally,  these  are  usually  referred  to  as  erroneous 
transitions  and  erroneous  states.  If  either  of  these 
internal  situations  exists,  then  it  follows  that  the  system 
has  had  a  component  or  design  failure. 

"An erroneous transition of a system is an internal state
transition  to  which  subsequent  failure  could  be  attributed. 
Specifically,  there  must  exist  a  possible  sequence  of 
interactions  which  would,  in  the  absence  of  corrective 
action  from  the  system,  lead  to  a  system  failure 
attributable  to  the  erroneous  transition." 

"An  erroneous  state  of  a  system  is  an  internal  state 
which  could  lead  to  a  failure  by  a  sequence  of  valid 
transitions.  Specifically,  there  must  exist  a  possible 
sequence  of  interactions  which  would,  in  the  absence  of 
corrective  action  by  the  system  and  in  the  absence  of 
erroneous  transitions  lead  from  the  erroneous  state  to  a 
system  failure."  (Anderson  and  Lee,  1983) 

Design  faults  are  unpredictable  and  unexpected. 
Unanticipated  errors  are  the  result.  Experience  from 
previous,  identical  or  similar  faults  is  usually  not 
helpful in resolving the design fault.

If these ideas are acceptable, then it is not difficult
to determine why a major portion of the effort in fault
tolerance has been aimed at component failure to date. To
illustrate the point, make this trivial comparison of an
easy component failure and an easy design failure:

COMPONENT  -  A  diode  fails,  causing  an  open  circuit. 
It  was  predictable  since  diodes  only  last  so  many 
hours  in  certain  environments.  The  result  of  a 
continually  open  circuit  is  predictable  and  can  be 
anticipated.  It  is  repaired  in  the  same  manner  as 
the last diode failure.

DESIGN - A logic circuit fails to produce the desired
results. The desired results are to exchange the
values of variable Ai and Aj without using an
auxiliary variable.

The algorithm used:

Ai := Aj
Aj := Ai

An obvious design failure causes the value Ai to be
lost.

To correct the algorithm the following adjustment may
be made:

Ai := Ai - Aj
Aj := Aj + Ai
Ai := Aj - Ai

The problem did not go away; it just changed and became
more difficult to find. Now, it works for most cases;
however, this algorithm will fail when i = j. The resultant
error may not always be the same. The repair of the error
may require different procedures for each error; therefore,
the experience of past identical or similar failures may not
be useful.
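
The residual fault in the adjusted algorithm can be demonstrated directly. The sketch below transcribes the three assignments into Python over a list, so that i = j makes both names refer to the same storage location; the function name `swap_without_auxiliary` is chosen only for this illustration.

```python
def swap_without_auxiliary(a, i, j):
    """Exchange a[i] and a[j] using the three-step arithmetic method."""
    a[i] = a[i] - a[j]
    a[j] = a[j] + a[i]
    a[i] = a[j] - a[i]


# Normal case: two distinct locations are exchanged correctly.
values = [7, 3]
swap_without_auxiliary(values, 0, 1)
print(values)

# Residual design fault: when i = j the first step zeroes the
# location, and the original value can never be recovered.
values = [7, 3]
swap_without_auxiliary(values, 0, 0)
print(values)
```

A test suite that never exchanges an element with itself would pass this module, which is exactly the kind of design fault that survives into service.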

Techniques  such  as  top  down  development,  structured 
programming,  step-wise  refinement,  and  information  hiding 
all embrace the principle of divide and (hope to) conquer.

Some  of  the  problems  of  software-oriented  fault 
tolerance  are  unique  to  the  errors  found  in  software  design. 
There  have  been  two  methods  suggested  for  providing  software 
with  fault  tolerant  characteristics  which  address  those 
unique  qualities  which  make  software  fault  tolerance  so 
difficult. The methods are called the "recovery block
scheme" and "N-version programming." Both operate under the
assumption that, despite the use of fault prevention
techniques, any complex software system will always contain
residual faults when placed in service.

The recovery block scheme was designed as a method to
provide fault tolerant capabilities to sequential programs.
The application module is called the primary module and is
defined as a non-redundant software module which has been
designed and implemented to satisfy the authorizing
specification.

The  first  stage  of  the  process  is  to  detect  when  an 
error  arises  during  the  execution  of  the  "primary  module." 
Since  the  primary  module  has  been  debugged  and  tested  as 
much  as  practicable,  the  module  may  run  several  times 
without  any  error  conditions.  There  are  advantages  and 
disadvantages  to  placing  an  error  detector  within  the  code 
itself.  The  error  detection  module  should  be  separate  and 
executed  immediately  after  the  execution  of  the  "primary 
module,"  before  any  data  are  transferred  and  used  in  a 
subsequent  module.  This  "acceptance  test"  consists  of  a 
sequence  of  statements  which  will  raise  an  exception  if  the 
state  of  the  system  is  not  acceptable.  The  primary  module 
will  have  failed  if  any  exception  is  raised. 

The  second  stage  is  to  assess  the  damage  and  repair  or 
recover.  This  function  will  require  a  recovery  point 
mechanism  to  prevent  rerun  of  entire  modules  or  groups  of 
modules.  Therefore,  at  the  beginning  of  each  primary  module, 
there will be a sequence of statements, or some method of
indicating the status of the machine environment. Various
methods are in wide use today and may be applied. When the
error occurs and is detected by the "acceptance test," there
must be an available alternate module to process. If we were
to recover and repeat the primary module under the same
conditions, we would expect the error to recur and we would
have an endless cycle of execution. The alternate modules
are logically capable of producing the required output as
defined by the "acceptance test." They may be less
efficient with respect to memory and/or speed, however, or
be designed in a less complicated fashion. Regardless of
their weaknesses, their strength is that they have less
probability of design error. Perhaps several alternate
modules are developed, each less complicated than the
previous. This alternate module is, in fact, built-in
redundancy and reflects the degree of reliability required
of the system. The alternate modules are invoked only if an
error occurs, thus minimizing the overhead. Software
monitors can be used to record activation occurrences of
these alternate modules. Run time overhead is incurred for
each recovery point established and for each call to the
acceptance test module. The recovery block logic appears as:

ESTABLISH  RECOVERY  POINT 

EXECUTE  PRIMARY  MODULE 

APPLY  ACCEPTANCE  TEST  TO  RESULT 

INVOKE  ALTERNATE  MODULE  IF  FAILURE  EXISTS 

APPLY  ACCEPTANCE  TEST  TO  RESULT 

INVOKE  ALTERNATE  MODULE  IF  FAILURE  EXISTS 

The  usual  syntax  associated  with  the  scheme  is: 

ENSURE <acceptance test criteria>
BY <primary module name>
ELSE BY <alternate module #1>
ELSE BY <alternate module #2>
ELSE ERROR


These blocks can be nested and do not impose
constraints on the programming style or the methodology
being used. They are compatible with structured programming
techniques and high level languages.

In a sort application, the fault tolerant recovery
block might appear as:

ENSURE A[j+1] >= A[j] for j = 1, 2, ..., n-1
BY Sort A using quicksort
ELSE BY Sort A using shell sort
ELSE BY Sort A using insertion sort
ELSE ERROR

Where  quicksort  is  a  time-efficient,  slick,  compactly- 
coded  module;  shell  sort  is  less  efficient,  but  easier  to 
visualize  and  less  prone  to  design  error;  and  insertion  sort 
is  a  simple  brute-force  type  of  sort  routine. 
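
The sort recovery block above can be sketched in a modern language as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the recovery point is a simple copy of the list, the function names are invented for the example, and `faulty_quicksort` is a deliberately broken stand-in for a primary module that contains a residual design fault.

```python
import copy


def is_sorted(a):
    """Acceptance test: A[j+1] >= A[j] for every adjacent pair."""
    return all(a[j + 1] >= a[j] for j in range(len(a) - 1))


def faulty_quicksort(a):
    """Stand-in for a primary module with a residual design fault."""
    a.reverse()  # produces sorted output only for some inputs


def shell_sort(a):
    gap = len(a) // 2
    while gap > 0:
        for i in range(gap, len(a)):
            item, j = a[i], i
            while j >= gap and a[j - gap] > item:
                a[j] = a[j - gap]
                j -= gap
            a[j] = item
        gap //= 2


def insertion_sort(a):
    for i in range(1, len(a)):
        item, j = a[i], i
        while j > 0 and a[j - 1] > item:
            a[j] = a[j - 1]
            j -= 1
        a[j] = item


def recovery_block(state, acceptance_test, modules):
    """Try each module in turn; return the name of the one accepted."""
    recovery_point = copy.deepcopy(state)        # ESTABLISH RECOVERY POINT
    for module in modules:
        try:
            module(state)                        # EXECUTE MODULE
            if acceptance_test(state):           # APPLY ACCEPTANCE TEST
                return module.__name__
        except Exception:
            pass                                 # an exception is a failure
        state[:] = copy.deepcopy(recovery_point)  # backward error recovery
    raise RuntimeError("ELSE ERROR: all alternates failed")


data = [5, 2, 9, 1]
used = recovery_block(data, is_sorted,
                      [faulty_quicksort, shell_sort, insertion_sort])
print(used, data)
```

The acceptance test follows the ENSURE clause literally and checks ordering only. A stricter test might also confirm that the output is a permutation of the input, at the price of additional run time overhead on every call.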

Fail   soft  approaches  also  use  this  technique  by  having 
each   alternate   module  provide  some  reduced   service   or   a 
subset  of  the  required  output.    The  following  is  an  example 
of    a    disk-to-memory  transfer   function   which   is    being 
gracefully  degraded: 

ENSURE  consistency  of  disk  transfer  queue 
BY   enter  request  in  optimal  queue  position 
ELSE  BY  enter  request  at  end  of  each  queue 
ELSE BY send warning message 'request ignored'
ELSE  ERROR 

While the above example may cause problems for the
program requesting the transfer, the rest of the system can
proceed without disruption.

Control flow of the recovery process can be directly
supported  by  microcode,  or  by  the  use  of  special  purpose 
instructions  which  are  tailored  to  suit  recovery  block 
schemes.  The  Anderson  and  Kerr  report  describes  an 
experimental  architecture  which  provides  this  support  in  a 
recovery  cache  mechanism. 

So  far,  it  has  been  assumed  that  exceptions  occurring 
during  the  execution  of  a  recovery  block  program  will  result 
in  backward  error  recovery  and  a  transfer  of  control  to  the 
next  alternate  module. 

There  is,  however,  nothing  to  prevent  a  forward  error 
recovery  technique  within  a  module  of  a  recovery  block.  An 
application  of  this  concept  might  be  used  in  the  detection 
of  an  underflow  occurrence.  An  error  handler  could  insert 
the  lowest  value  number  used  in  the  machine  and  continue  to 
process;  then  flag  the  result  with  a  note  to  the  user  that 
the  substitution  was  made.  In  this  case,  it  is  more 
efficient  than  finding  a  recovery  point  and  running  an 
alternate  module.  Other  situations  exist  which  favor  forward 
error  recovery  as  well. 
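
The underflow case might be sketched as below. Two assumptions are made for the example: "the lowest value number used in the machine" is read as the smallest positive normalized floating point value (`sys.float_info.min` in Python), and underflow is detected only when a product of two nonzero operands collapses all the way to zero. The function name is hypothetical.

```python
import sys


def multiply_with_forward_recovery(x, y):
    """Return (product, substituted), repairing total underflow in place."""
    result = x * y
    if result == 0.0 and x != 0.0 and y != 0.0:
        # Forward error recovery: substitute the smallest normalized
        # magnitude, keep the correct sign, and flag the result.
        smallest = sys.float_info.min
        same_sign = (x > 0) == (y > 0)
        return (smallest if same_sign else -smallest), True
    return result, False


value, substituted = multiply_with_forward_recovery(1e-200, 1e-200)
if substituted:
    print("note: underflow occurred; smallest machine value substituted")
print(value)
```

No recovery point is restored and no alternate module runs; processing simply continues with the repaired value, and the flag carries the note to the user.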

The recovery block scheme is conceptually simple. The
implementation may not be as simple. Alternate modules
require extra coding and require more complicated module
interfaces. This added complexity has not been quantifiable
to date. This is true because of the lack of total testing
results and the lack of use in multiple application areas.
Because alternate modules are independent of one another,
the implementation of multiple alternate modules is not much
more difficult than just one alternate module, if previous
modules exist. This could be very expensive if all modules
were developed from scratch; therefore, a good source of
alternate  modules  is  found  in  the  prior  versions  of  a 
module.  For  example,  as  updated  replacement  modules  (with 
optimized  features,  or  refined  functions)  are  ready  to  be 
implemented,  the  previous  module  may  be  used  as  the  first 
alternate. 

The  acceptance  test  procedure  adds  a  dimension  of 
complexity  that  is  not  present  in  non-fault  tolerant 
systems,  although,  for  every  set  of  alternate  and  primary 
modules,  this  acceptance  test  need  only  be  designed  and 
implemented  once.  In  the  SPLICE  application,  this  module 
should  not  be  unacceptably  large  or  complex.  It  should  be 
noted that SPLICE will use a Tandem Computer system which is
designed  to  provide  redundant  hardware  and  software  modules 
for  all  major  functions. 

The  second  method  for  implementing  software  fault 
toleration  is  conceptually  simpler.  The  N-Version  approach 
to  fault  tolerance  has  N  versions,  where  N  is  greater  than 
1,  of  independently  designed  code  which  satisfies  a  single 
specification.  All  the  versions  are  executed  and  all  the 
results  are  compared.  The  correct  (majority)  response  is 
sent  to  its  destination  while  the  erroneous  responses,  if 
they  exist,  are  ignored. 

The implementation of the scheme requires a driver
program to invoke each version, collect each response,
compare the results, and determine the correct response.
Each version operates atomically and uses the same input
space, so the driver must synchronize the execution of
versions. The originators of this approach (Chen and
Avizienis, 1981) use a handshake method with "wait" and
"send" primitives to ensure that only one version at a time
is  running.  This  scheme  prevents  realizing  the  advantages  of 
pipelining,  or  parallel  processing.  It  also  requires  a 
timeout detection capability to prevent infinite loops.

An advantage of the N-version technique is that damage
assessment  would  not  have  to  be  done  at  all,  and  error 
recovery  is  done  by  simply  ignoring  the  erroneous  answer.  An 
obvious  disadvantage  on  the  other  hand  is  the  overhead  of 
atomically  executing  each  version  with  the  same  input  space, 
and  executing  a  voting  check  to  obtain  a  single  result. 
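
A driver of the kind described above might look like the following sketch. It runs the versions one at a time on the same input and takes a strict majority vote. It does not implement the "wait" and "send" handshake or the timeout detection, both of which a real driver would need, and the three sample versions (including the deliberately faulty one) are invented for the example.

```python
from collections import Counter


def n_version_execute(versions, *args):
    """Run every version on the same input and return the majority result."""
    results = []
    for version in versions:
        try:
            results.append(version(*args))
        except Exception:
            continue  # a version that raises simply casts no vote
    if not results:
        raise RuntimeError("no version produced a result")
    answer, votes = Counter(results).most_common(1)[0]
    if votes <= len(versions) // 2:
        raise RuntimeError("no majority among the versions")
    return answer


# Three independently written versions of one specification:
# "return the sum of the integers 1 through n".
def sum_by_formula(n):
    return n * (n + 1) // 2


def sum_by_loop(n):
    total = 0
    for k in range(1, n + 1):
        total += k
    return total


def sum_with_fault(n):
    return n * (n - 1) // 2  # deliberate design fault


print(n_version_execute([sum_by_formula, sum_by_loop, sum_with_fault], 10))
```

The erroneous response is outvoted and ignored without any damage assessment, but every call pays for executing all N versions plus the vote.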

There are few, if any, theoretical reasons for not
adopting fault tolerance techniques. It seems that the real
factors preventing widespread acceptance are:

o  Education:  Lack  of  courses,  books,  and  quantifiable 
evidence  of  improved  reliability,  make  it  a 
difficult  idea  to  sell  to  managers. 

o  Psychological: Adopting these schemes is, in some
way,  admitting  the  existence  of  design  faults. 
Therefore,  the  designers  and  programmers  do  not 
have  motivation  to  persuade  the  managers. 

o  Cost: It is perceived, by project managers, as an
additional  cost  without  additional  functionality. 
Without  the  necessary  motivation,  they  are  unable 
or  unwilling  to  justify  the  development  or 
runtime  overhead  involved. 

Since a great deal of money will be spent on SPLICE
software (development and maintenance), producing fault
tolerant software may be the most advantageous approach from
a systems life cycle cost perspective. Again, we note that
hardware fault tolerance is provided by the SPLICE Tandem
system. However, this feature is of little help if there
are errors in applications software.

B.    Jeopardizing Coupling and Cohesion Principles

A  designer  should  not  attempt  to  excessively  enlarge 
functional  modules  in  an  effort  to  reduce  overhead.  This 
must  be  a  carefully  considered  option.  If  multiple  functions 
are  combined  or  extra  data  are  passed  to  prevent  another 
call,  then  the  modular  characteristics  begin  to  deteriorate. 
This  may  result  in  a  very  expensive  and  time  consuming 
maintenance  portion  of  the  life  cycle.  Controlling 
functional  modules  should  be  capable  of  coordinating  an 
entire  transaction.  For  a  single  controlling  module  to 
coordinate  an  entire  transaction,  it  is  required  that  it 
contain  all  logic  necessary  to  perform  that  transaction. 
System  designs  which  dictate  that  coded  modules  be  as 
cohesive  as  practical  and  that  coupling  interfaces  be  kept 
simple,  are  usually  easier  to  maintain.  It  is  far  better, 
at  least  in  the  design  phase,  to  split  out  functions  in  the 
widest  pattern.  More  breadth  in  design  eases  the  task  of 
maintaining  coupling  and  cohesion  principles,  while  not 
adversely  impacting  the  implementation.  If,  for  example, 
after sizing the transactions, it appears that control
cannot be achieved efficiently, it is possible to
systematically combine the modules with the least negative
impact and the most benefit.

C.  Use  of  ADA  Program  Unit  Specifications 

The  use  of  ADA  as  a  program  design  language  would  serve 
to  enhance  the  SPLICE  software  design  process.  The  use  of 
ADA program unit specifications during design yields a
design that can be implemented easily in either ADA or
Pascal. Additionally, this systematic approach will allow
this design to interface more easily with future DOD
supported projects in
the  logistics  environment.  The  ADA  Program  Support 
Environment  (APSE)  could  have  a  long  learning  period 
associated  with  it.  To  begin  using  it  now  would  improve 
FMSO's  long  term  position  with  respect  to  the  Navy's  support 
of  DOD  policy  and,  at  the  same  time,  provide  a  good 
environment  in  which  to  learn  it  with  relatively  low  risk. 
APSE  would  provide  configuration  control  of  the  functional 
allocation  of  each  program  unit. 

D.  Naming  of  Modules 

The  naming  of  unique  modules  will  be  difficult  to 
achieve  because  of  the  widespread  adoption  of  local  unique 
programs  at  each  of  the  stock  points.  Many  stock  points  have 
personalized  the  reorder  process,  the  inventory  process,  and 
many  management  reports.  In  some  stock  points,  local  unique 
programs  form  the  basis  of  the  MIS  used,  instead  of  the  FMSO 
provided UADPS-SP. Local programs may have the same names
as modules at another stock point; they may also have the
same name and function as a module from UADPS-SP. Because
most stock points believe they have unique requirements or
because the wait to get requested UADPS-SP enhancements is
excessively long, they have created a number of local-unique
programs that enhance UADPS-SP or in some cases replace it.
Transaction Item Reporting (TIR) notices from the stock
point to the ICPs for ICP-controlled items are largely
handled by stock point local-unique programs.

This  is  a  critical  problem  to  solve  before  the  design 
of  Session  Services  or  any  other  module  goes  too  far.  If  the 
local unique programs cannot be standardized for all stock
points,  or  cannot  be  placed  in  a  global  library  with  unique 
names  and  parameters  identified,  the  entire  level  of  control 
and  ability  to  share  data  and  processes  envisioned  by  SPLICE 
proponents  is  at  risk. 
VIII.  SUMMARY

The Session Services Module should contain functions as
described in section III with the data described in
section IV as a support structure. The interfaces described
in section VI must be kept in mind during the detail design
phase of the project. The major design recommendation offered
in  this  paper  is  the  fault  tolerant  concept.  Design 
recommendations  to  prevent  large  modules  which  hinder  the 
coupling  and  cohesion  characteristics  of  modules  and  the  use 
of  ADA  program  unit  specifications  are  mentioned  because  the 
SPLICE  environment  may  be  well  served  by  their  use.  The 
recommendation  to  standardize  the  local  unique  program  names 
is  discussed  in  an  effort  to  prevent  the  serious  situation 
that  would  result  if  this  aspect  of  the  SPLICE  project  were 
ignored.  The  stock  points  which  will  be  served  by  this 
system  have  a  long  history  of  addressing  local  requirements 
with locally designed and developed software which
interfaces  with  FMSO-provided  software.  All  of  the  stock 
point  requirements  must  be  examined  carefully  and  a  decision 
that  is  mutually  agreeable  between  designers  and  users  must 
be  made  concerning  what  parts  are  to  be  standard  and 
unmodified,  and  which  parts  are  to  be  changed  by  local 
adaptations.  The  eventual  use  or  misuse  of  the  system  could 
be  determined  by  this  decision. 